





Wisc-Ex-99-352 

H. Hu, J. Nielsen 

June 1, 1999 






Analytic Confidence Level Calculations using
the Likelihood Ratio and Fourier Transform

HONGBO HU AND JASON NIELSEN

University of Wisconsin-Madison, Wisconsin, USA


Abstract

The interpretation of new particle search results involves a confidence
level calculation on either the discovery hypothesis or the
background-only ("null") hypothesis. A typical approach uses toy
Monte Carlo experiments to build an expected experiment estimator
distribution against which an observed experiment's estimator may
be compared. In this note, a new approach is presented which calculates
analytically the experiment estimator distribution via a Fourier
transform, using the likelihood ratio as an ordering estimator. The
analytic approach enjoys an enormous speed advantage over the toy
Monte Carlo method, making it possible to quickly and precisely
calculate confidence level results.

1 Introduction

A consistently recurring topic at LEP2 has been the interpretation and com- 
bination of results from searches for new particles. The fundamental task is 
to interpret the collected dataset in the context of two complementary hy- 
potheses. The first hypothesis - the null hypothesis - is that the dataset is 
compatible with non-signal Standard Model background production alone, 
and the second is that the dataset is compatible with the sum of signal and 
Standard Model background production. In most cases, the search for new 
particles proceeds via several parallel searches for final states. The results 
from all of these subchannels are then combined to produce a final result. 

All existing confidence level calculations follow the same general strategy
[1], [2], [3]. A test statistic or estimator is constructed to quantify the
"signal-ness" of a real or simulated experiment. The "signal-ness" of a single
observed experiment leads to the confidence level on, for example, the null 
hypothesis that the observed experiment is incompatible with signal and 
background both being produced. Most calculation methods use an ensem- 
ble of toy Monte Carlo experiments to generate the estimator distribution 
against which the observed experiment is compared. This generation can be 
rather time-consuming when the number of toy Monte Carlo experiments is 
great (as it must be for high precision calculations) or if the number of signal 
and background expected for each experiment is great (as it is for the case 
of searches optimized to use background subtraction). 

In this note, we present an improved method for calculating confidence 
levels in the context of searches for new particles. Specifically, when the 
likelihood ratio is used as an estimator, the experiment estimator distribu- 
tion may be calculated analytically with the Fourier transform. With this 
approach, the disadvantage of toy Monte Carlo experiments is avoided. The 
analytic method offers several advantages over existing methods, the most 
dramatic of which is the increase in calculation speed and precision. 

2 Likelihood ratio estimator for searches 

The likelihood ratio estimator is the ratio of the probabilities of observing an 
event under two search hypotheses. The estimator for a single experiment is 

$$E = C\,\frac{\mathcal{L}_{s+b}}{\mathcal{L}_b}. \qquad (1)$$



Here $\mathcal{L}_{s+b}$ is the probability density function for
signal+background experiments and $\mathcal{L}_b$ is the probability density
function for background-only experiments. Because the constant factor $C$
appears in each event's estimator, it does not affect the ordering of the
estimators; an event cannot become more signal-like by choosing a different
$C$. For clarity in this note, the constant is chosen to be $e^{s}$, where $s$
is the expected number of signal events (footnote 1).
For the simplest case of event counting with no discriminant variables (or,
equivalently, with perfectly non-discriminating variables), the estimator can
be calculated with Poisson probabilities alone. In practice, not every event
is equally signal-like. Each search may have one or more event variables that
discriminate between signal-like and background-like events. For the general
case, the probabilities $\mathcal{L}_{s+b}$ and $\mathcal{L}_b$ are functions
of the observed events' measured variables.
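
For the pure counting case mentioned above, the estimator for an experiment with $n$ observed events can be written out explicitly. The following worked step is a sketch that assumes Poisson probabilities with means $s+b$ and $b$ and the choice $C = e^{s}$:

```latex
E = e^{s}\,\frac{e^{-(s+b)}\,(s+b)^{n}/n!}{e^{-b}\,b^{n}/n!}
  = \left(1 + \frac{s}{b}\right)^{n},
\qquad
\ln E = n \ln\!\left(1 + \frac{s}{b}\right).
```

With this choice of $C$, an experiment with no observed events has $E = 1$, and each additional event multiplies the estimator by the same factor $1 + s/b$.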

As an example, consider a search using one discriminant variable $m$, the
reconstructed Higgs mass. The signal and background have different
probability density functions of $m$, defined as $f_s(m)$ and $f_b(m)$,
respectively. (For searches with more than one discriminant variable, $m$ is
replaced by a vector of discriminant variables $\vec{m}$.) It is then
straightforward to calculate $\mathcal{L}_{s+b}$ and $\mathcal{L}_b$ for a
single event, taking into account the event weighting coming from the
discriminant variables:

$$E = e^{s}\,\frac{\mathcal{L}_{s+b}}{\mathcal{L}_b} = e^{s}\,\frac{e^{-(s+b)}\,[s f_s(m) + b f_b(m)]}{e^{-b}\,[b f_b(m)]}. \qquad (2)$$
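
To make equation (2) concrete, the short sketch below evaluates the single-event estimator at two mass values. The Gaussian signal shape, the flat background, the 60 to 120 GeV/c^2 window, and all function names are illustrative assumptions and are not taken from any real analysis.

```python
import math

def signal_pdf(m):
    """Assumed signal shape: Gaussian at 90 GeV/c^2 with 3 GeV/c^2 resolution."""
    z = (m - 90.0) / 3.0
    return math.exp(-0.5 * z * z) / (3.0 * math.sqrt(2.0 * math.pi))

def background_pdf(m):
    """Assumed background shape: flat on [60, 120] GeV/c^2."""
    return 1.0 / 60.0 if 60.0 <= m <= 120.0 else 0.0

def event_estimator(m, s, b):
    """Equation (2) with C = e^s: [s f_s(m) + b f_b(m)] / [b f_b(m)]."""
    sig = s * signal_pdf(m)
    bkg = b * background_pdf(m)
    return (sig + bkg) / bkg

s, b = 5.0, 3.0                      # expected signal and background (illustrative)
on_peak = event_estimator(90.0, s, b)
off_peak = event_estimator(70.0, s, b)
```

Under these assumptions an event at the signal peak receives a much larger estimator than one far from it, while an event in a region with essentially no expected signal has an estimator close to 1.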

The likelihood ratio estimator can be shown to maximize the discovery 
potential and exclusion potential of a search for new particles ||. Such 
an estimator, both with and without discriminant variables, has been used 
successfully by the LEP2 collaborations to calculate confidence levels for 
searches 0. |3H . 



3 Ensemble estimator distributions via Fast 
Fourier Transform (FFT) 

One way to form an estimator for an ensemble of events is to generate a large
number of toy Monte Carlo experiments, each experiment having a number of
events generated from a Poisson distribution. Another way is to analytically
compute the probability density function of the ensemble estimator given the
probability density function of the event estimator. The discussion of this
section pursues the latter approach.

Footnote 1: When considering the two production hypotheses and calculating an
exclusion, the expected signal $s$ is uniquely determined by the cross section.
If the cross section is not fixed, then $e^{s}$ is not constant, and $C$ may be
set to unity.

The likelihood ratio estimator is a multiplicative estimator. This means 
the estimator for an ensemble of events is formed by multiplying the indi- 
vidual event estimators. Alternatively, the logarithms of the estimators may 
be summed. In the following derivation, $F = \ln E$, where $E$ is the likelihood
ratio estimator. 

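For an experiment with 0 events observed, the estimator is trivial:

$$E = e^{s}\,\frac{e^{-(s+b)}}{e^{-b}} = 1, \qquad (3)$$

$$F = 0, \qquad (4)$$

$$\rho_0(F) = \delta(F), \qquad (5)$$

where $\rho_0(F)$ is the probability density function of $F$ for experiments
with 0 observed events.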

For an experiment with exactly one event, the estimator is, again using 
the reconstructed Higgs mass m, 

$$E = e^{s}\,\frac{e^{-(s+b)}\,[s f_s(m) + b f_b(m)]}{e^{-b}\,[b f_b(m)]}, \qquad (6)$$

$$F = \ln \frac{s f_s(m) + b f_b(m)}{b f_b(m)}, \qquad (7)$$

and the probability density function of $F$ is defined as $\rho_1(F)$.
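
One possible way to obtain a binned version of $\rho_1(F)$ from the discriminant distributions is to scan the mass range and histogram $F$, weighting each mass point by the signal+background density. The sketch below does this under assumed shapes (Gaussian signal, flat background) with hypothetical function and parameter names; it is not the authors' implementation.

```python
import math

def signal_pdf(m):
    """Assumed signal shape: Gaussian at 90 GeV/c^2 with 3 GeV/c^2 resolution."""
    z = (m - 90.0) / 3.0
    return math.exp(-0.5 * z * z) / (3.0 * math.sqrt(2.0 * math.pi))

def background_pdf(m):
    """Assumed background shape: flat on [60, 120] GeV/c^2."""
    return 1.0 / 60.0 if 60.0 <= m <= 120.0 else 0.0

def binned_rho1(s, b, m_lo, m_hi, f_max, n_bins, n_steps=20000):
    """Histogram F = ln([s f_s + b f_b] / [b f_b]) on [0, f_max) for single
    events distributed according to the signal+background mixture."""
    hist = [0.0] * n_bins
    dm = (m_hi - m_lo) / n_steps
    for i in range(n_steps):
        m = m_lo + (i + 0.5) * dm               # midpoint rule in m
        sig, bkg = s * signal_pdf(m), b * background_pdf(m)
        if bkg <= 0.0:
            continue                            # F is undefined without background
        weight = (sig + bkg) / (s + b) * dm     # mixture density times step size
        f_val = math.log((sig + bkg) / bkg)     # always >= 0
        k = min(int(f_val / f_max * n_bins), n_bins - 1)
        hist[k] += weight
    total = sum(hist)
    return [h / total for h in hist]            # per-bin probabilities, sum to 1

rho1 = binned_rho1(s=5.0, b=3.0, m_lo=60.0, m_hi=120.0, f_max=4.0, n_bins=64)
```

For the background-only hypothesis the same scan would weight each mass point by $f_b(m)$ alone.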

For an experiment with exactly two events, the estimators of the two
events are multiplied to form an experiment estimator. If the reconstructed
Higgs masses of the two events are $m_1$ and $m_2$, then

$$E = \frac{[s f_s(m_1) + b f_b(m_1)]\,[s f_s(m_2) + b f_b(m_2)]}{[b f_b(m_1)]\,[b f_b(m_2)]}, \qquad (8)$$

$$F = \ln \frac{s f_s(m_1) + b f_b(m_1)}{b f_b(m_1)} + \ln \frac{s f_s(m_2) + b f_b(m_2)}{b f_b(m_2)}. \qquad (9)$$

The probability density function for exactly two events, $\rho_2(F)$, is simply
the convolution of $\rho_1(F)$ with itself:

$$\rho_2(F) = \iint \rho_1(F_1)\,\rho_1(F_2)\,\delta(F - F_1 - F_2)\,dF_1\,dF_2 \qquad (10)$$

$$= \rho_1(F) \otimes \rho_1(F). \qquad (11)$$
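
On a grid, the self-convolution in equations (10) and (11) becomes a double sum over bins. The following minimal sketch uses a made-up three-bin one-event density, with bin $k$ holding the probability at $F = k\,\Delta F$; the numbers and names are purely illustrative.

```python
def convolve(p, q):
    """Discrete linear convolution of two binned densities (lists of bin probabilities)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for k, q_k in enumerate(q):
            out[i + k] += p_i * q_k      # F values add, so bin indices add
    return out

rho1 = [0.2, 0.5, 0.3]                   # hypothetical one-event density
rho2 = convolve(rho1, rho1)              # two-event density: [0.04, 0.2, 0.37, 0.3, 0.09]
```

The cost of repeating this directly for every event multiplicity is what motivates moving to the Fourier domain.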



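The generalization to the case of $n$ events is straightforward and
encouraging:

$$E = \prod_{i=1}^{n} \frac{s f_s(m_i) + b f_b(m_i)}{b f_b(m_i)}, \qquad (12)$$

$$F = \sum_{i=1}^{n} \ln \frac{s f_s(m_i) + b f_b(m_i)}{b f_b(m_i)}, \qquad (13)$$

$$\rho_n(F) = \int \cdots \int \prod_{i=1}^{n} \left[\rho_1(F_i)\,dF_i\right] \delta\!\left(F - \sum_{i=1}^{n} F_i\right) \qquad (14)$$

$$= \underbrace{\rho_1(F) \otimes \cdots \otimes \rho_1(F)}_{n\ \text{times}}. \qquad (15)$$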



Next, the convolution of $\rho_1(F)$ is rendered manageable by an application
of the relationship between the convolution and the Fourier transform.

If $A(F) = B(F) \otimes C(F)$, then the Fourier transforms of $A$, $B$, and $C$
satisfy

$$\overline{A}(G) = \overline{B}(G) \cdot \overline{C}(G). \qquad (16)$$

This allows the convolution to be expressed as a simple power: 



$$\overline{\rho}_n(G) = \left[\overline{\rho}_1(G)\right]^{n}. \qquad (17)$$



Note this equation holds even for $n = 0$, since $\overline{\rho}_0(G) = 1$. For
any practical computation, the analytic Fourier transform can be approximated by a
numerical Fast Fourier Transform (FFT) 0j. 
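
The power relation in equation (17) can be checked numerically. The sketch below uses a naive discrete Fourier transform as a stand-in for an FFT routine and a made-up three-bin one-event density; the zero-padding length and all names are assumptions made for illustration.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT routine)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[idx] * cmath.exp(sign * 2j * cmath.pi * idx * k / n)
                  for idx in range(n))
        out.append(acc / n if inverse else acc)
    return out

def rho_n_via_transform(rho1, n_events, n_bins):
    """n-fold self-convolution of rho1, computed as [transform(rho1)]^n."""
    padded = list(rho1) + [0.0] * (n_bins - len(rho1))   # zero-pad against wrap-around
    spectrum = [z ** n_events for z in dft(padded)]
    return [z.real for z in dft(spectrum, inverse=True)]

rho1 = [0.2, 0.5, 0.3]                      # hypothetical one-event density
rho2 = rho_n_via_transform(rho1, 2, 8)      # matches direct convolution of rho1 with itself
rho0 = rho_n_via_transform(rho1, 0, 8)      # n = 0: all probability in the F = 0 bin
```

The grid must be long enough to hold the full $n$-event distribution, because the discrete transform implements a circular convolution.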

How does this help to determine $\rho_{s+b}$ and $\rho_b$? The probability
density function for an experiment estimator with $s$ expected signal and $b$
expected background events is

$$\rho_{s+b}(F) = \sum_{n=0}^{\infty} e^{-(s+b)}\,\frac{(s+b)^{n}}{n!}\,\rho_n(F), \qquad (18)$$


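where $n$ is the number of events observed in the experiment. Upon Fourier
transformation, this becomes

$$\overline{\rho}_{s+b}(G) = \sum_{n=0}^{\infty} e^{-(s+b)}\,\frac{(s+b)^{n}}{n!}\,\overline{\rho}_n(G) \qquad (19)$$

$$= \sum_{n=0}^{\infty} e^{-(s+b)}\,\frac{(s+b)^{n}}{n!}\,\left[\overline{\rho}_1(G)\right]^{n} \qquad (20)$$

$$= e^{(s+b)\left[\overline{\rho}_1(G) - 1\right]}. \qquad (21)$$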



The function $\rho_{s+b}(F)$ may then be recovered by using the inverse transform.
In general, this relation holds for any multiplicative estimator. 
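
Equation (21) translates almost line by line into code. The sketch below is one possible arrangement and not the authors' package: it uses a small recursive radix-2 FFT written out in plain Python, a grid $F = k\,\Delta F$ starting at zero, and an assumed Gaussian one-event density. The grid size, bin width, and function names are choices made for this illustration.

```python
import cmath
import math

def fft(x, inverse=False):
    """Recursive radix-2 FFT without normalization; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def experiment_pdf(rho1, expected):
    """Binned experiment density from the binned one-event density rho1:
    inverse transform of exp(expected * (transform(rho1) - 1))."""
    spectrum = [cmath.exp(expected * (z - 1.0)) for z in fft(rho1)]
    n = len(rho1)
    return [z.real / n for z in fft(spectrum, inverse=True)]

d_f, n_bins = 1.0 / 32.0, 4096               # grid F = k * d_f covers [0, 128)
mu, sigma, expected = 2.0, 0.2, 20.0         # assumed one-event Gaussian, s + b = 20
rho1 = [math.exp(-0.5 * ((k * d_f - mu) / sigma) ** 2) for k in range(n_bins)]
norm = sum(rho1)
rho1 = [v / norm for v in rho1]              # per-bin probabilities summing to 1
pdf = experiment_pdf(rho1, expected)
mean_f = sum(k * d_f * p for k, p in enumerate(pdf))
```

In this setup `pdf[0]` holds the zero-event probability $e^{-(s+b)}$ and `mean_f` comes out at $(s+b)\,\mu$. The grid is deliberately much wider than the bulk of the distribution so that circular wrap-around is negligible.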

This final relation means that the probability density function for an ar- 
bitrary number of expected signal and background events can be calculated 
analytically once the probability density function of the estimator is known 
for a single event. This calculation is therefore just as fast for high back- 
ground searches as for low background searches. In particular, it holds great 
promise for Higgs searches which, due to use of background subtraction and 
discriminant variables, are optimized to higher background levels than they 
have been in the past. 

Two examples will provide practical proof of the principle. For the first, 
assume a hypothetical estimator results in a probability density function of 
simple Gaussian form 

$$\rho_1(F) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(F-\mu)^{2}}{2\sigma^{2}}}, \qquad (22)$$

where $\sigma = 0.2$ and $\mu = 2.0$. For an expected $s + b = 20.0$, both the
FFT method and the toy Monte Carlo method are used to evolve the event
estimator probability density function into an experiment estimator
probability density function. The agreement between the two methods (Fig. 1)
is striking. The higher precision of the FFT method is apparent, even when
compared to 1 million toy Monte Carlo experiments. The periodic structure is
due to the discrete Poisson distribution being convolved with a narrow event
estimator probability function. In particular, the peak at $\ln E = 0$
corresponds to the probability that exactly zero events are observed
($e^{-(s+b)} \approx 2.1 \times 10^{-9}$). The precision of the toy Monte Carlo
method is limited by the number of Monte Carlo experiments, while the
precision of the FFT method is limited only by computer precision.
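
For comparison, a toy Monte Carlo version of this first example can be sketched as follows. The Poisson sampler (Knuth's multiplication method), the seed, and the number of toys are choices made for this illustration and are not taken from the note.

```python
import math
import random

def poisson_sample(rng, mean):
    """Knuth's multiplication method; adequate for moderate means such as 20."""
    limit, count, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def toy_experiment(rng, expected, mu, sigma):
    """One toy experiment: Poisson number of events, summed Gaussian ln E per event."""
    n_events = poisson_sample(rng, expected)
    return sum(rng.gauss(mu, sigma) for _ in range(n_events))

rng = random.Random(12345)                   # fixed seed, chosen arbitrarily
toys = [toy_experiment(rng, 20.0, 2.0, 0.2) for _ in range(5000)]
mean = sum(toys) / len(toys)
variance = sum((t - mean) ** 2 for t in toys) / (len(toys) - 1)
zero_event_fraction = sum(1 for t in toys if t == 0) / len(toys)
```

The sample mean and variance should scatter around $(s+b)\,\mu = 40$ and $(s+b)(\sigma^2 + \mu^2) = 80.8$. The zero-event peak, with probability of about $2.1 \times 10^{-9}$, would need roughly $5 \times 10^{8}$ toys before even one such experiment is expected, which is why a toy sample of this size cannot resolve it.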

For the second example, a more realistic estimator is calculated using
a discriminant variable distribution from an imaginary $HZ \to H\nu\bar{\nu}$
search. The variable used here is the reconstructed Higgs mass of the event.
This estimator's probability density function is then calculated for an
experiment with $s = 5$ and $b = 3$ expected events (Fig. 2). Again, the two
methods agree well in regions where the toy Monte Carlo method is useful.

Figure 1: The experiment estimator probability density function for a simple
event estimator probability function calculated with the FFT method (solid
red line) and the toy Monte Carlo method (dashed green line). Error bars
associated with the Monte Carlo method are due to limited statistics.

Figure 2: The experiment estimator probability density function for an
estimator based on reconstructed Higgs mass in $HZ \to H\nu\bar{\nu}$ searches.
The result from the FFT method is the solid red line, and the result from the
toy Monte Carlo method is the dashed green line.



These examples support the mathematical proof of the FFT method described
above. Because the final confidence coefficients $c_{s+b}$ and $c_b$ are simply
integrals of the experiment estimator probability density function, any
confidence levels calculated with the FFT method and the toy Monte Carlo
method are identical up to the statistical precision of the toy Monte Carlo
sample. The examples also show the precision achievable with the FFT method,
a precision that will be important when testing discovery hypotheses at the
$5\sigma \approx 5 \times 10^{-7}$ level.
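
The integration step can be sketched for a binned density as a cumulative sum. The convention used here, taking the coefficient to be the probability of an estimator value at or below the observed one, is an assumption of this sketch since the note does not spell it out. The four-bin density and the function name are also hypothetical.

```python
import math

def confidence_coefficient(pdf, d_f, f_observed):
    """Probability of F <= f_observed for a binned density whose bin k holds
    the probability at F = k * d_f."""
    last_bin = math.floor(f_observed / d_f + 1e-9)   # small offset guards against rounding
    if last_bin < 0:
        return 0.0
    return math.fsum(pdf[: last_bin + 1])

pdf = [0.1, 0.2, 0.4, 0.3]                   # hypothetical experiment density
c_mid = confidence_coefficient(pdf, 0.5, 1.0)     # bins at F = 0, 0.5, 1.0
c_low = confidence_coefficient(pdf, 0.5, -0.5)    # below the grid
c_all = confidence_coefficient(pdf, 0.5, 10.0)    # above the grid
```

Applying the same function to the signal+background density and to the background-only density would give the two coefficients $c_{s+b}$ and $c_b$ under this convention.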

4 Combining results from several searches 

Given the multiplicative properties of the likelihood ratio estimator, the com- 
bination of several search channels proceeds intuitively. The estimator for 
any combination of events is simply the product of the individual event es- 
timators. Consequently, construction of the estimator probability density 
function for the combination of channels parallels the construction of the es- 
timator probability density function for the combination of events in a single 
channel. In particular, for a combination with N search channels: 

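$$\overline{\rho}_{s+b}(G) = \prod_{j=1}^{N} \overline{\rho}^{\,j}_{s+b}(G) \qquad (23)$$

$$= e^{\sum_{j=1}^{N} (s_j + b_j)\left[\overline{\rho}^{\,j}_1(G) - 1\right]}. \qquad (24)$$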

Due to the strictly multiplicative nature of the estimator, this combina- 
tion method is internally consistent. No matter how subsets of the combina- 
tions are rearranged (i.e., combining channels in different orders, combining 
different subsets of data runs), the result of the combination does not change. 
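
The channel combination in equation (24) and its independence of ordering can be illustrated with a small numerical sketch. The two channels below, their expected event counts, and their one-event densities are invented for the example. A naive discrete Fourier transform again stands in for an FFT routine.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT routine)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[idx] * cmath.exp(sign * 2j * cmath.pi * idx * k / n)
                  for idx in range(n))
        out.append(acc / n if inverse else acc)
    return out

def combined_pdf(channels, n_bins):
    """channels: list of (expected_events, rho1), with rho1 binned on a common F grid.
    Returns the combined experiment density via the summed exponent of equation (24)."""
    exponents = []
    for expected, rho1 in channels:
        padded = list(rho1) + [0.0] * (n_bins - len(rho1))
        exponents.append([expected * (z - 1.0) for z in dft(padded)])
    total = [sum(column) for column in zip(*exponents)]
    return [z.real for z in dft([cmath.exp(e) for e in total], inverse=True)]

channel_a = (1.5, [0.0, 0.7, 0.3])           # hypothetical channel: s+b and rho1
channel_b = (0.8, [0.0, 0.0, 0.4, 0.6])      # hypothetical channel: s+b and rho1
pdf_ab = combined_pdf([channel_a, channel_b], 64)
pdf_ba = combined_pdf([channel_b, channel_a], 64)
```

Both orderings give the same density up to floating-point rounding, and the $F = 0$ bin holds the probability $e^{-(1.5 + 0.8)}$ of observing no events in either channel.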

Once results are obtained for $\rho_{s+b}(F)$ and $\rho_b(F)$, simple integration
gives the confidence coefficients $c_{s+b}$ and $c_b$. From this point, confidence
levels for the two search hypotheses may be calculated in a number of ways 
P, [!| 0. Those straightforward calculations are outside the scope of this 
note. 



5 Final remarks and conclusions 

A few short remarks conclude this note and emphasize the advantages of
calculations using the likelihood ratio with the Fast Fourier Transform (FFT)
method.

1. The likelihood ratio estimator is an optimal ordering estimator for
maximizing both discovery and exclusion potential. Such an estimator can
only improve the discovery or exclusion potential of a search. 

2. As a multiplicative estimator, the likelihood ratio estimator ensures 
internal consistency when results are combined. For example, if the 
dataset is split into several smaller pieces, the combined result always 
remains the same. 

3. The probability density function of an ensemble estimator may be
calculated analytically from the event estimator probability density
function. Avoiding toy Monte Carlo generation brings revolutionary
advances in speed and precision. For a $HZ \to$ 4-jets search with 25
expected background events, a full confidence level calculation with
$2^{18}$ toy MC experiments and 60 Higgs mass hypotheses takes
approximately fifteen CPU hours. By contrast, the same calculation using
the FFT method takes approximately two CPU minutes. This discrepancy
only increases as the required confidence level precision and the number
of toy MC experiments increase. For example, confidence level calculations
for discovery at the $5\sigma$ level would require $O(10^{8})$ toy MC
experiments. Given the approximately linear scaling of calculating time
with number of toy experiments, such a calculation would take up almost a
year in the 4-jet channel alone! The precision of the analytic FFT method
is more than sufficient for a $5\sigma$ discovery.
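
The time estimate in item 3 follows from the quoted figures. A rough check, assuming strictly linear scaling and taking exactly $10^{8}$ toy experiments (the note gives only the order of magnitude):

```latex
t_{5\sigma} \approx 15\ \text{h} \times \frac{10^{8}}{2^{18}}
            \approx 15\ \text{h} \times 381
            \approx 5.7 \times 10^{3}\ \text{CPU hours}
            \approx 240\ \text{CPU days}.
```

A requirement somewhat above $10^{8}$ toy experiments would push this figure towards a full year, compared with about two CPU minutes for the FFT method.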

A fast confidence level calculation makes possible studies that might have 
otherwise been too CPU-intensive with the toy MC method. These include 
studies of improvements in the event selections, of various working points, 
and of systematic errors and their effects, among others. A precise calcu- 
lation makes possible rejection of null hypotheses at the level necessary for 
discovery. 

The marriage of the likelihood ratio estimator and the FFT method seems
well-suited for producing extremely fast and precise confidence level results,
and the flexibility and ease of use of the clfft package should make this a
powerful tool in interpreting searches for new particles.